2 research outputs found

    Convolutional Drift Networks for Video Classification

    Analyzing spatio-temporal data like video is a challenging task that requires processing visual and temporal information effectively. Convolutional Neural Networks have shown promise as baseline fixed feature extractors through transfer learning, a technique that helps minimize the training cost on visual information. Temporal information is often handled using hand-crafted features or Recurrent Neural Networks, but these can be overly specific or prohibitively complex. Building a fully trainable system that can efficiently analyze spatio-temporal data without hand-crafted features or complex training remains an open challenge. We present a new neural network architecture to address this challenge, the Convolutional Drift Network (CDN). Our CDN architecture combines the visual feature extraction power of deep Convolutional Neural Networks with the intrinsically efficient temporal processing provided by Reservoir Computing. In this introductory paper on the CDN, we provide a very simple baseline implementation tested on two egocentric (first-person) video activity datasets. We achieve video-level activity classification results on par with state-of-the-art methods. Notably, performance on this complex spatio-temporal task was produced by training only a single feed-forward layer in the CDN. Comment: Published in IEEE Rebooting Computing
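    To make the described pipeline concrete, the sketch below wires fixed per-frame CNN features into an Echo State Network-style reservoir and trains only a single feed-forward readout, mirroring the "train one layer" property the abstract highlights. All sizes, scaling constants, the ridge-regression readout, and the use of the final reservoir state as the video descriptor are illustrative assumptions, not details taken from the paper; random arrays stand in for real CNN feature extractions.

```python
import numpy as np

# Minimal CDN-style sketch: fixed CNN features -> fixed reservoir -> trained readout.
# Assumptions (not from the paper): dimensions, spectral radius, ridge readout.
rng = np.random.default_rng(0)

FEAT_DIM = 512    # per-frame CNN feature size (assumed)
RES_DIM = 500     # reservoir size (assumed)
N_CLASSES = 10    # number of activity classes (assumed)

# Fixed, untrained reservoir weights (Echo State Network style).
W_in = rng.uniform(-0.1, 0.1, size=(RES_DIM, FEAT_DIM))
W_res = rng.normal(size=(RES_DIM, RES_DIM))
# Rescale the recurrent matrix to spectral radius < 1 for the echo state property.
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))

def reservoir_state(frame_feats):
    """Drive the reservoir with per-frame CNN features of shape (T, FEAT_DIM);
    use the final state as the video-level descriptor (an assumption)."""
    x = np.zeros(RES_DIM)
    for u in frame_feats:
        x = np.tanh(W_in @ u + W_res @ x)
    return x

def train_readout(videos, labels, ridge=1e-2):
    """Train the single feed-forward layer via closed-form ridge regression."""
    X = np.stack([reservoir_state(v) for v in videos])  # (N, RES_DIM)
    Y = np.eye(N_CLASSES)[labels]                       # one-hot targets
    return np.linalg.solve(X.T @ X + ridge * np.eye(RES_DIM), X.T @ Y)

def predict(W_out, video):
    return int(np.argmax(reservoir_state(video) @ W_out))

# Toy usage: random arrays stand in for real CNN feature sequences.
videos = [rng.normal(size=(30, FEAT_DIM)) for _ in range(20)]
labels = rng.integers(0, N_CLASSES, size=20)
W_out = train_readout(videos, labels)
print(predict(W_out, videos[0]))
```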

    Tapered-Precision Numerical Formats for Deep Learning Inference and Training

    The demand to deploy deep learning models on edge devices has recently increased due to their pervasiveness, in applications ranging from healthcare to precision agriculture. However, a major challenge with current deep learning models is their computational complexity. One approach to address this limitation is to compress deep learning models by employing low-precision numerical formats, but such low-precision models often suffer from degraded inference or training accuracy. This raises the question: which low-precision numerical format can meet the objective of high training accuracy with minimal resources? This research introduces tapered-precision numerical formats for deep learning inference and training. These formats have the inherent capability to match the distribution of deep learning parameters by expressing values with unequal-magnitude spacing, such that the density of values is greatest near zero and tapers toward the maximum representable number. First, we develop low-precision arithmetic frameworks that utilize tapered-precision numerical formats to enhance the performance of deep learning inference and training. Second, we develop a software/hardware co-design framework that identifies the right format for inference under user-defined constraints through integer linear programming optimization. Third, we propose novel adaptive low-precision algorithms that match the tapered-precision numerical format configuration to the layerwise dynamic range and distribution of parameters within a deep learning model. Finally, we propose a numerical analysis approach and a signal-to-quantization-noise ratio equation for tapered-precision numerical formats, using this metric to select the appropriate numerical format configuration. The efficacy of the proposed approaches is demonstrated on various benchmarks. The results show that, in the accuracy and hardware-cost trade-off, low-precision deep neural networks using tapered-precision numerical formats outperform other well-known numerical formats, including floating point and fixed point.
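    The toy sketch below illustrates the tapered-spacing idea: it builds a grid of representable values that is dense near zero and tapers toward the maximum representable value, quantizes zero-centered "weights" against it, and compares signal-to-quantization-noise ratio (SQNR) with a uniform fixed-point-like grid. The μ-law-style grid construction, bit width, and constants are illustrative assumptions, not the exact formats or SQNR equation proposed in this work.

```python
import numpy as np

rng = np.random.default_rng(0)

def tapered_grid(bits=6, max_val=8.0, mu=255.0):
    """Representable values with unequal-magnitude spacing: densest near
    zero, tapering toward max_val. Built via inverse mu-law companding
    (an illustrative stand-in for a tapered-precision format)."""
    n = 2 ** (bits - 1)                                  # levels per sign
    t = np.linspace(0.0, 1.0, n)
    mags = max_val * (np.expm1(t * np.log1p(mu)) / mu)   # (1+mu)^t - 1, scaled
    return np.concatenate([-mags[::-1], mags])

def quantize(x, grid):
    """Round each value to the nearest representable grid point."""
    idx = np.abs(x[:, None] - grid[None, :]).argmin(axis=1)
    return grid[idx]

def sqnr_db(x, xq):
    """Signal-to-quantization-noise ratio in dB."""
    return 10 * np.log10(np.sum(x**2) / np.sum((x - xq) ** 2))

# Deep-learning parameters are roughly zero-centered and bell-shaped,
# which is the distribution the tapered spacing is meant to match.
w = rng.normal(0.0, 0.5, size=10000)
grid_tapered = tapered_grid()
grid_uniform = np.linspace(-8.0, 8.0, grid_tapered.size)  # fixed-point-like

print("tapered SQNR (dB):", sqnr_db(w, quantize(w, grid_tapered)))
print("uniform SQNR (dB):", sqnr_db(w, quantize(w, grid_uniform)))
```

    For this bell-shaped toy distribution, the tapered grid yields a noticeably higher SQNR than the uniform grid at the same number of levels, which is the intuition behind matching the format's value density to the parameter distribution.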